reinforcement learning stability AI News List

List of AI News about reinforcement learning stability

Time: 2025-10-24 14:38

Inclusion AI Unveils Ring-1T: First 1 Trillion-Parameter Open Reasoning Model with Breakthroughs in AI Scalability

According to @godofprompt on Twitter, Inclusion AI has launched Ring-1T, described as the first open-source 1 trillion-parameter Mixture-of-Experts reasoning model (source: @godofprompt, Oct 24, 2025). The model is positioned as one built for multi-step reasoning rather than plain prediction. The post highlights three innovations: IcePop, which addresses reinforcement learning instability by clipping noisy gradients; C3PO++, a rollout engine that accelerates long reasoning traces by 2.5 times; and the ASystem framework, which synchronizes all trillion parameters in under 10 seconds to support distributed RL at this scale. Reported benchmark results are 93.4 on AIME-25, 86.7 on HMMT-25, 2088 on Codeforces, and silver-medal level on IMO-2025, which the post says surpasses previous open models on complex reasoning tasks. The post frames the release as a business opportunity in AI-driven problem-solving, advanced analytics, and enterprise automation, particularly in sectors that depend on high-level reasoning. Because the weights are open, both startups and enterprises can build applications on the model.
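The item above says only that IcePop stabilizes reinforcement learning "by clipping noisy gradients" and gives no further detail. As a rough illustration of what gradient clipping means in general, here is a minimal sketch of global-norm clipping, a widely used generic technique. It is not IcePop's actual algorithm, and the function name `clip_by_global_norm` and the sample values are assumptions made up for this example.

```python
import math

def clip_by_global_norm(grads, max_norm):
    """Rescale gradient vectors so their combined L2 norm is at most max_norm.

    Generic global-norm clipping, shown for illustration only; this is NOT
    the IcePop method, whose details the source does not give.
    Returns (clipped_grads, norm_before_clipping).
    """
    total = math.sqrt(sum(g * g for vec in grads for g in vec))
    if total <= max_norm:
        # Already within the limit: return a copy unchanged.
        return [list(vec) for vec in grads], total
    scale = max_norm / total
    return [[g * scale for g in vec] for vec in grads], total

# Hypothetical gradients for two parameter groups.
grads = [[3.0, 4.0], [0.0, 12.0]]
clipped, norm_before = clip_by_global_norm(grads, max_norm=1.0)
norm_after = math.sqrt(sum(g * g for vec in clipped for g in vec))
print(norm_before, norm_after)
```

Scaling every gradient by the same factor keeps the update direction and limits only its size, which is why this form of clipping is a common first defense against occasional very large, noisy updates.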
